Social influence (1) drives both offline and online human behaviour. It pervades cultural
markets, and manifests itself in the adoption of scientific and technical innovations as well
as the spread of social practices (2). Prior empirical work on the diffusion of innovations in
spatial regions or social networks has largely focused on the spread of one particular
technology among a subset of all potential adopters. It has also been difficult to determine
whether the observed collective behaviour is driven by natural influence processes, or whether
it follows external signals such as media or marketing campaigns. Here, we choose an online
context that allows us to study social influence processes by tracking the popularity of a
complete set of applications installed by the user population of a social networking site, thus
capturing the behaviour of all individuals who can influence each other in this context. By
extending standard fluctuation scaling methods, we analyse the collective behaviour induced by
100 million application installations, and show that two
distinct regimes of behaviour emerge in the system. Once applications cross
a particular threshold of popularity, social influence processes induce highly 
correlated adoption behaviour among the users, which propels some of the ap- 
plications to extraordinary levels of popularity. Below this threshold, the col- 
lective effect of social influence appears to vanish almost entirely in a manner 
that has not been observed in the offline world. Our results demonstrate that 
even when external signals are absent, social influence can spontaneously as- 
sume an on-off nature in a digital environment. It remains to be seen whether 
a similar outcome could be observed in the offline world if equivalent experi- 
mental conditions could be replicated. 

Social influence captures the ways in which people affect each other's beliefs, feelings, and
behaviors. It has traditionally been in the domain of social psychology with principal focus on
micro-level processes among individuals (1), but it also plays a prominent role across the social
sciences, for example, in the study of contagion in sociology (2), herding behavior in economics
(9), speculative bubbles in financial markets (10), voting behavior (11), and interpersonal health
(12). Social influence plays an especially important role in cultural markets (13), for products
such as books and music, and generally pervades any arena of life where the attitudes and tastes
of individuals are influenced by others.

It is often useful to distinguish between local and global sources of influence, which
typically are identified with an individual's interpersonal environment and the mass media,
respectively (14). The overall social influence arises from a mixture of local and global
influences, which themselves emerge from different signals. The fact that these two processes
operate at very different scales poses considerable challenges for the empirical study of social
influence. For the purposes of our study, we define (i) the local signal as information on the
behavior of individuals who are friends or acquaintances of ego, the person whose behavior is
being analyzed, and (ii) the global signal as information on the aggregate behavior of the
population. Note that these definitions rely on the potentially observable behaviors of others
as opposed to the non-observable ones, such as their intentions or feelings. This framework
incorporating local and global signals is very generic, and possible behaviors range from the
consumption of cultural products to making lifestyle choices.

The structures of social influence are most naturally addressed from the perspective of social
network analysis (15). The notion of local influence presupposes that individuals are embedded
in a social network that channels and directs how behaviors spread. Examples include innovation
adoption among physicians, as well as the behaviors examined in other empirical and theoretical
studies of diffusion. The notion of global influence, on the other hand, presupposes
that individuals have information on the aggregate popularity of products and behaviors. While
a given social network can be used as a proxy for communicating the behavioral signals, one
should ideally have access to a network that accurately represents the potential communication
channels for a given local signal, and these channels may vary between different behaviors. In
addition, individuals are often selective as to what information they choose to disclose to their
friends, resulting in the local signal being necessarily incomplete, biased, or misrepresented
(19). Similarly, while accurate population level statistics exist for popular items, it is much
harder to find statistics for more marginal products and behaviors.

A novel opportunity to study human behavior in a setting that overcomes these methodological
limitations is provided by certain online environments. These systems have the advantage
of allowing access to complete sub-populations of agents. When combined with appropriate
tools of analysis, they enable the direct study of collective macro-level social behavior in very
large social systems without sampling. We study a complete online social system with
well-defined local and global signals by harnessing data from Facebook (21), a hugely popular
social networking site (SNS), which at the time of data collection had approximately 50 million
active users worldwide. In addition to the current popular interest in social networks, scholars
have recognized the potential of these and other social websites for research (22, 23, 24, 25, 26),
reflecting the current move to utilising rich large-scale datasets on human behavior and
communication (27, 28). Facebook users, in line with other SNSs, can construct a public or
semi-public profile within a bounded system, articulate a list of other users, "Facebook friends",
with whom they share a connection, and view and traverse their list of connections and those made
by others within the system (29).

Facebook users can also install (and uninstall) applications (Fig. 1A) that enable them, for
instance, to play poker and compare their taste in movies with their friends (21). Whenever
a user adopts a new application, her friends are automatically notified by the system (30), but
the users can also see the applications of any of their friends simply by visiting their profile.
Consequently, users with many Facebook friends are then, at least in principle, in a position 
to influence a larger number of other users. In addition, everyone has access at all times to 
an all-inclusive listing of applications ranked by their global popularity, which acts as an ef- 
fective "best seller" list. Although applications are free of charge, popular applications have 
the advantage of being readily discoverable (low search cost), and are more likely to be of 
higher quality both with respect to reliability (exhaustively tested) and functionality (superior 
features). The applications provide recreational value and can be seen as cultural goods, and 
the different ways the users process the local and global signals in choosing applications reflect 
their personal preferences, i.e. the underlying heterogeneity of the population. 

In addition to the distinction between local and global signals, it is important to classify
systems into two separate categories based on whether their dynamics are endogenous without
external drivers, or exogenous and driven externally. Epidemic spreading in a closed system
is an example of an endogenous process with local transmission, since the pathogens need to
be passed from one person to another in close physical proximity. Similarly, it is possible
to model the spread of innovations such as the uptake of new hybrid crops by farmers as an
endogenous social contagion process, and to try to distinguish between different types of local
processes that may underlie the observed rate at which the innovation is adopted (20). However,
studies of social influence which focus on local and endogenous processes such as word-of-mouth
transmission are almost always open to the challenge that they neglect equally important
exogenous effects such as marketing or mass advertising, and separating these two confounding
factors is typically highly problematic. For instance, a recent re-analysis of the
classic diffusion studies on how prescriptions for an antibiotic drug spread among physicians
in different communities suggests that marketing efforts, in this context corresponding
to external drivers, can account for most of the observed behavior. In the current setup, both
the local and global signals are generated endogenously within the system, i.e. there is no
exogenous driver (31).

Figure 1: Facebook users and applications. (A) The users (round nodes) form a social network
(solid lines) which influences their behavior in adopting applications (hexagons). (B) Number
of users $n_i(t)$ as a function of time $t$ for four applications, of which "Texas HoldEm Poker"
is the most popular one at the end. (C) Number of users $n_i(T)$ sorted in descending order for
the 2123 applications that have $n_i(T) > 0$ (Zipf plot). (D) The probability density
distribution $P(n(T))$ vs. $n(T)$ is fat-tailed. The dashed line $\sim n(T)^{-2}$ is intended to
guide the eye and corresponds to the limit where the mean of the distribution diverges.

We downloaded the data from Facebook for all existing 2720 applications between June 25,
2007 and August 14, 2007, shortly after applications were introduced. These data consist of
time series $n_i(t)$ with $i = 1, 2, \ldots, M = 2720$ and $t = 1, 2, \ldots, T = 1208$
corresponding to the aggregate number of users who have application $i$ installed at time $t$
(Fig. 1B). Data for 15 applications were partly corrupted and were consequently omitted from the
analysis, leaving us with 2705 applications, or 99% of the data. Importantly, studying all the
applications avoids a selection bias, which is generated by examining the trajectories of only
those applications that spread successfully, as tends to be done in most studies on social
influence. Successful products in cultural markets have been found to be orders of magnitude
more popular than the average cultural product (13). This finding is also manifest in the case
of Facebook applications. The number of users at the end of the time horizon, $n_i(T)$, sorted
in descending order is shown in Fig. 1C. For the ten most popular applications these numbers
vary between $n_{(1)}(T) \approx 12$ million and $n_{(10)}(T) \approx 4.6$ million, whereas
$n_{(100)}(T) \approx 180{,}000$ and $n_{(1000)}(T) \approx 1{,}300$.
The probability density distribution for the number of application installations (Fig. 1D) has a
very fat tail and decays so slowly that even its mean value diverges in the limit of infinite system
size.

Each new installation, in addition to increasing the overall user base of the application and 
thus its global signal, also generates a local signal, through which the adopter may in turn in- 
fluence the future behavior of his friends. Each installation thus acts as a microscopic social 
stimulus and creates a form of positive feedback in the system. Note that the observable behav- 
ior which generates patterns of social influence in this case is restricted to the adoption of an 
application, rather than its use. Given that the users are part of a very large social network, the 
consequences of adopting an application are not limited to a user's immediate neighborhood, 
but may percolate further in the network. This underlines the importance of having data that 
reflects the behavior of the entire network even if the underlying microscopic data are not avail- 
able. While the impact of a single installation is admittedly minute, the superposition of the 
observed 104 million application installations leaves behind a detectable footprint. 

To study the effect of social influence, i.e. the extent to which the behavior of an individual
(his installing an application) is related to the behavior of others (their installing the same
application), we turn to the method of fluctuation scaling (FS). This allows us to extract a key
signature of the system's behavior purely on the basis of the above aggregate data. FS has been
applied successfully to a number of complex systems whose interacting elements participate in
some dynamic process. Examples of application domains range from fluctuations in population
sizes in ecology to fluctuations in stock trading activity in financial markets (32, 7, 8). Here
we outline how FS can be utilized in the current problem, and refer the reader to Supplementary
Information (SI) for details. For a given application $i$, the act of individual $j$ regarding
installation of the application is encoded by the random variable $S_{ij}(t)$, where
$S_{ij}(t) = 1$ corresponds to him installing the application at time $t$, and $S_{ij}(t) = 0$
corresponds to him doing nothing.
From the stochastic process point of view, one can think of each individual tossing coins at
every time step, one per application, to decide whether he will install the given application.
Social influence, operating through the local and global signals, is likely to render the coin
tosses dependent for any given application (Fig. 2A,B). To measure the strength of social
influence, we define the net activity $f_i(t)$ of application $i$ at time $t$ as

$$f_i(t) = n_i(t) - n_i(t-1) = \sum_{j=1}^{N} S_{ij}(t) = \sum_{k=1}^{N - n_i(t)} S_{i j_k}(t), \qquad (1)$$

which corresponds to the net increase in the number of installations for application $i$ between
times $t-1$ and $t$. It can be expressed in terms of the individual constituent variables as shown,
where the first sum is taken over all $N$ individuals, whereas the latter sum is taken over potential
new installers, with the subset of indices $j_1, j_2, \ldots, j_{N - n_i(t)} \in \{1, 2, \ldots, N\}$
such that $S_{i j_k}(t-1) = 0$. In terms of the above analogy, once a user has installed a given
application, he stops tossing the particular coin corresponding to that application.
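
As a minimal sketch of how the net activity can be obtained from the aggregate counts alone, the snippet below differences a small array of cumulative user counts and computes the temporal mean and standard deviation of each series. The toy array `n` and the helper names `net_activity` and `temporal_moments` are illustrative assumptions and are not taken from the study.

```python
import numpy as np


def net_activity(n):
    """Return f_i(t) = n_i(t) - n_i(t-1) for each row i of cumulative counts n."""
    n = np.asarray(n, dtype=float)
    return np.diff(n, axis=1)


def temporal_moments(f):
    """Temporal mean mu_i and (population) standard deviation sigma_i per row."""
    return f.mean(axis=1), f.std(axis=1)


# Hypothetical toy data: two applications observed at five time steps.
n = np.array([[0, 2, 4, 6, 8],
              [0, 1, 5, 6, 10]])

f = net_activity(n)            # increments per time step
mu, sigma = temporal_moments(f)
```

Each row of `f` plays the role of one net-activity series, and the pair (`mu`, `sigma`) gives one point per application in a fluctuation scaling plot.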

According to FS, the temporal average $\mu_i$ and standard deviation $\sigma_i$ of $f_i(t)$ are
related through the relationship $\sigma_i \sim \mu_i^{\alpha}$. This motivates us to identify a
region in which the relationship between $\log \mu_i$ and $\log \sigma_i$ is linear. The value of
the fluctuation scaling exponent $\alpha$ is given by the slope of the line. Although $\alpha$ lies
in the rather narrow range $[1/2, 1]$, its value is crucial as an indicator of statistical coupling
in the system (Fig. 2A,B). If the behavior of a user is independent of the behavior of others, one
would expect $\alpha = 1/2$, whereas if her behavior is fully correlated with
others one would expect $\alpha = 1$ for all applications (33).
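
The two limiting values of the exponent can be checked with a short simulation of the coin-tossing picture. The sketch below is an illustration under stated assumptions (fair coins, a fixed random seed, and group sizes `ks` chosen arbitrarily) and is not the analysis applied to the Facebook data. It draws the sum of k independent 0/1 coins as a binomial variable, draws the fully coupled case as a single coin with sides 0 and k, and estimates the slope in the log-log plane with an ordinary least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, chosen arbitrarily
ks = np.array([10, 100, 1000, 10000])   # hypothetical group sizes
trials = 20000                          # repetitions per group size


def fs_slope(mean, std):
    """Least-squares slope of log10(std) against log10(mean)."""
    return np.polyfit(np.log10(mean), np.log10(std), 1)[0]


# (i) k independent fair coins: their sum is Binomial(k, 1/2).
indep = np.array([rng.binomial(k, 0.5, size=trials) for k in ks])

# (ii) k fully coupled coins: one fair coin with sides 0 and k.
coupled = np.array([k * rng.integers(0, 2, size=trials) for k in ks])

alpha_indep = fs_slope(indep.mean(axis=1), indep.std(axis=1))
alpha_coupled = fs_slope(coupled.mean(axis=1), coupled.std(axis=1))
```

Under these assumptions `alpha_indep` should come out close to 1/2 and `alpha_coupled` close to 1, matching the uncorrelated and fully correlated limits described above.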

As shown in Fig. 2C, applications with $\log(\mu_i) > \log(\mu_\times) \approx 0.36$ define the
collective regime governed by $\alpha_C \approx 0.85$, which indicates strong correlations among
the constituent variables, i.e. the underlying "coin tosses". Application installations above this
point are influenced by the behavior of others. Unexpectedly and contrary to previous empirical
studies of other systems (151), breakpoint analysis (see S4 in SI) shows that the system exhibits
another qualitatively different regime for the less popular applications. This individual regime
with $\log(\mu_i) < \log(\mu_\times)$ has $\alpha_I \approx 0.55$, which is very close to the
limiting case of $\alpha = 1/2$, meaning that application installations are nearly uncorrelated and
social influence is negligible. The transition between the two regimes takes place at approximately
$\log(\mu_\times) = 0.36$, which translates into an average daily activity of
$24 \times 10^{0.36} \approx 55$ new installations a day. We emphasize
that theoretical considerations guided our choice to fit a linear function to the data in Fig. 2C
as opposed to, say, trying to find the best fit among a class of curvilinear functions. While it
would be interesting to resolve also the precise location and nature of the transition (sharp or
continuous), we are unable to make this distinction on the basis of the empirical data. However,
the central finding on the existence of two different regimes remains unaffected.

Figure 2: Fluctuation scaling (FS). (A) The concept of FS can be illustrated by considering
tossing coins in two ways. (i) We toss a group of $k$ coins independently with sides
corresponding to 0 and 1 and let $f_k$ equal their sum. (ii) We toss a single coin with sides 0 and
$k$, which corresponds to tossing $k$ fully coupled coins. (B) We perform the experiment several
times and calculate the average $\langle f_k \rangle$ and standard deviation $\sigma_k$ of $f_k$
as shown in the schematic. In both cases $\langle f_k \rangle \sim k$, whereas
$\sigma_k \sim \sqrt{k}$ in (i) but $\sigma_k \sim k$ in (ii). Varying the value of $k$
produces a series of points in the $(\log \mu_k, \log \sigma_k)$ plane, where
$\mu_k = \langle f_k \rangle$. From the FS point of view, this simple
example resembles Facebook users making decisions on application adoption; the "coins" are
now biased, reflecting individual heterogeneity, and the tosses are not independent but coupled
via the local and global signals (see S2 in SI). (C) Of the 2705 Facebook applications in
the empirical data set, 2562 with $\mu_i > 0$ and $\sigma_i > 0$ are plotted here (see S3 in SI).
Two qualitatively different regimes emerge and they are separated by a cross-over point located at
$\log \mu_\times = 0.36$. The first, individual regime is characterized by the exponent
$\alpha_I \approx 0.55$, and the second, collective regime by $\alpha_C \approx 0.85$. (D) The
synthetic data set consists of 2705 time series of which 2163 have $\mu_i > 0$ and
$\sigma_i > 0$. We now obtain a single regime characterized by the exponent
$\alpha_S \approx 0.84$. Note that in C and D the exponents lie between 1/2 and 1, corresponding to
the extremes of completely uncorrelated and correlated decisions of users to adopt applications.
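
To make the idea of fitting two linear regimes concrete, the following sketch scans candidate breakpoints and, for each one, fits separate straight lines below and above it, keeping the split with the smallest total squared error. This is only one simple way to locate a cross-over and is an assumption on our part; it is not the breakpoint analysis of the SI. The function name `two_regime_fit`, the `min_points` guard, and the artificial data (generated with slopes 0.55 and 0.85 joined at 0.36 plus small noise) are all hypothetical.

```python
import numpy as np


def two_regime_fit(log_mu, log_sigma, min_points=10):
    """Scan breakpoints; return (breakpoint, slope_below, slope_above).

    Each candidate split fits two independent least-squares lines and the
    split with the lowest total squared error is kept.
    """
    order = np.argsort(log_mu)
    x = np.asarray(log_mu, dtype=float)[order]
    y = np.asarray(log_sigma, dtype=float)[order]
    best = None
    for b in range(min_points, len(x) - min_points):
        sse, slopes = 0.0, []
        for xs, ys in ((x[:b], y[:b]), (x[b:], y[b:])):
            slope, intercept = np.polyfit(xs, ys, 1)
            sse += np.sum((ys - (slope * xs + intercept)) ** 2)
            slopes.append(slope)
        if best is None or sse < best[0]:
            best = (sse, x[b], slopes[0], slopes[1])
    return best[1], best[2], best[3]


# Artificial data with a known kink, used only to exercise the function.
rng = np.random.default_rng(1)
x_demo = np.linspace(-2.0, 3.0, 200)
y_demo = np.where(x_demo < 0.36, 0.55 * x_demo, 0.85 * x_demo - 0.108)
y_demo = y_demo + rng.normal(0.0, 0.01, x_demo.size)

bp, slope_low, slope_high = two_regime_fit(x_demo, y_demo)
```

On the artificial data the recovered breakpoint and slopes should lie near the values used to generate it. On real data the location of the split is far less certain, in line with the caveat above about resolving the precise nature of the transition.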

The interpretation of FS exponents in terms of correlations assumes that the underlying
stochastic process is stationary (8). However, the fact that $n_i(t)$ increases over time
demonstrates that the system cannot be stationary. The question then becomes whether the system is
sufficiently close to stationarity so that the fluctuation scaling exponents can be given the above
interpretation. Let us impose the stringent condition that the system is sufficiently close to
stationarity when at most 1% of users have the application installed. We show in SI that even under
this strict condition, 98% of the time series are stationary. This also means that the scaling in
Fig. 2C holds for over two orders of magnitude above the cross-over point. We conclude that the
system is sufficiently stationary so that the temporal fluctuations may indeed be given the above
interpretation.

As a simple explanatory hypothesis for the observed behavior, one might suggest that the 
different scaling properties result from applications having different lifetimes. To test this, 
we divide the applications into three distinct groups based on their time of introduction such 
that each group covers an equally long time interval. We repeat the scaling plot by choosing 
randomly 300 applications from each group with the red, green, and blue colors indicating
whether the application was introduced in the first, second, or third interval (Fig. 3). Since any
interval of x-values contains an approximately equal number of markers of different colors, the 
time of introduction and, hence, application lifetime, does not explain its scaling properties. 

Should the transition from one regime to the other be attributed to the popularity of appli- 
cations reaching a certain threshold value, or should it be attributed to the system itself? If 
the former is true, then one might think that the transition corresponds to a phase transition or 
to the crossing of an epidemic threshold, essentially a density threshold effect, resulting in an 
epidemic of popularity. To isolate the effects of popularity, we construct rank-order preserving 
synthetic time series from the empirical ones. This deterministic process (apart from ties) cuts 
the empirical time series into pieces and then recombines the pieces using a rank based rule (see 
Methods). As shown in Fig. 2D, the transition disappears for the synthetic data. Statistical tests 
also support the existence of a single regime (see SI) and, in addition, Pearson's linear
correlation coefficient between $\log \sigma$ and $\log \mu$ is 0.99. The consequences of this are
threefold. First, the lack of two regimes for the synthetic data demonstrates that the transition
from one regime to the other is not a result of the popularity of an application exceeding a
certain threshold, so the phenomenon is not analogous to crossing an epidemic threshold. Second, it
demonstrates that the collective (correlated) regime does not result from the system becoming
saturated with users of a given application that would then induce correlations between the
behaviors of the individuals. This is because all the synthetic time series obey the same scaling
relation also for small values of $\log(\mu)$ (corresponding to the dilute limit), where the system
is far from being saturated. Third, the synthetic regime has an exponent $\alpha_S \approx 0.84$,
which is very close to $\alpha_C \approx 0.85$ that characterizes the collective regime for
empirical data. This shows that we can recover the exponent of the collective regime by assuming
that the future popularity of an application is driven by its current popularity, a finding that
has also been used to predict popularity of online content (34).

Figure 3: Effect of application lifetime on scaling. Visual inspection shows that any interval of
$\log \mu$-values contains a roughly equal number of red, green, and blue markers, indicating that
the time of introduction and, hence, application lifetime, is not related to its scaling properties.
The histograms at the bottom of the panel show exactly how many applications from each of the
three periods (red, green, blue) fall in the [-2, -1), [-1, 0), [0, 1), [1, 2), and [2, 3) intervals,
demonstrating clearly that there is no age trend in the scaling plot.

We have harnessed data on Facebook applications to study the role of social influence on
the dynamics of popularity in an endogenous online system. The way the platform, Facebook, 
and the cultural products, Facebook applications, have been set up in this self-contained system 
guarantees that the agents are subject to both local and global signals of influence. We have 
shown here that the studied online system exhibits a collective and individual regime, and ar- 
gued that the emergence of the two regimes is an inherent property of the system. Since each 
regime is characterized by a single fluctuation scaling exponent, the strength of social influ- 
ence is approximately constant across each regime. Consequently, the extent of social influence 
becomes discretized: either there is virtually no influence or, alternatively, the strength of influ- 
ence is that given by the exponent of the collective regime. This suggests that social influence 
assumes a binary, on-off nature in the system. It is worth pointing out that had we only
monitored the more successful (high $\mu$) applications, we would have been able to observe only
(part of) the collective regime.

We believe that our finding on the existence of the two regimes may well generalize to other 
systems. The move of an increasing number of human activities to the online world has en- 
dowed users with the power of participation. Familiar examples include the online book retailer 
Amazon and the online DVD rental service Netflix, both of which allow their users to rate the 
products and, consequently, influence their future popularity. While some books and films in 
these systems are certainly highly advertised by their producers, they arguably stand for only a 
small fraction of the choices available, leaving a large majority of books and films exposed to 
endogenously generated social influence. Social influence may then emerge spontaneously in 
a wide range of online environments over and above purely endogenous systems. Whether it 
becomes discretized in these systems as well remains to be seen.

Methods
Synthetic time series

We construct rank-order preserving synthetic time series from the empirical time series in order
to isolate the effects of popularity from other factors in the $\log \sigma$, $\log \mu$ plots.
This process is deterministic (apart from ties) and essentially it cuts the empirical time series
into pieces and then recombines the pieces using a rank based rule (see Fig. 4). Let us denote the
global ranking of application $k$ at time $t$ with $r_k(t) \in \{1, \ldots, M\}$ such that
$n_{(r_k - 1)}(t) \geq n_{(r_k)}(t) \geq n_{(r_k + 1)}(t)$. We define
$\tilde{n}_i(t) = \tilde{n}_i(t-1) + f_i(t)$ analogously to what we had before, but now
$f_i(t) = n_k(t) - n_k(t-1)$ such that $r_k(t-1) = i$. Here $f_i(t)$ is the number of new
installations over a single time step for an application that at time $t-1$ had ranking $i$. The
series are seeded by setting $\tilde{n}_i(1) = n_{(i)}(1)$ for all $i = 1, \ldots, M$ and are
constructed using the above recursive relation for $t \geq 2$.
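
The recursion can be sketched in a few lines of code. The version below is a hedged illustration rather than the code used for the study: the function name `synthetic_series`, the zero-based row convention (row 0 holds rank 1), and the tie-breaking rule (a stable sort, so ties go to the lower application index) are assumptions, since the text does not say how ties were resolved.

```python
import numpy as np


def synthetic_series(n):
    """Build rank-order preserving synthetic series from counts n of shape (M, T).

    Row i of the result follows whichever application held rank i + 1 at the
    previous time step and accumulates that application's increment.
    """
    n = np.asarray(n)
    M, T = n.shape
    n_syn = np.empty_like(n)
    # Seed with the empirical counts sorted in descending order at the first step.
    order = np.argsort(-n[:, 0], kind="stable")
    n_syn[:, 0] = n[order, 0]
    for t in range(1, T):
        # order[i] is the application that held rank i + 1 at time t - 1.
        order = np.argsort(-n[:, t - 1], kind="stable")
        increments = n[order, t] - n[order, t - 1]
        n_syn[:, t] = n_syn[:, t - 1] + increments
    return n_syn


# Hypothetical toy data: three applications, three time steps, one rank crossing.
n_demo = np.array([[10, 12, 20],
                   [5, 15, 16],
                   [1, 2, 3]])
n_syn = synthetic_series(n_demo)
```

Because every empirical increment is assigned to exactly one synthetic series, the total number of installations at each time step is unchanged, and when no rank crossings occur the synthetic series coincide with the rank-sorted empirical ones.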

The synthetic time series $\tilde{n}_i(t)$ by construction has a constant relative popularity as
measured by the global rank order of $\tilde{n}_i(t)$ and, consequently, the future popularity of
the synthetic time series is systematically driven only by its current popularity (rank). In the
absence of rank crossings, the synthetic data would behave like empirical data. The increments
$f_i(t)$ of the synthetic data result from a combined effect of both the local and global signals.
The impact of the global signal remains constant since the synthetic time series $\tilde{n}_i(t)$
always holds rank $i$ on the global "best seller" list. A single synthetic time series
$\tilde{n}_i(t)$ is typically a combination of several empirical time series and, therefore, the
local signal in the synthetic time series corresponds to a mean-field approximation of the local
signals of the applications that make up the synthetic time series $\tilde{n}_i(t)$.

Figure 4: Schematic of the construction of the synthetic time series $\tilde{n}_i(t)$. (A) The
empirical data consists of $t = 1, \ldots, 7$ observations for three applications. The data points
have been connected with dashed black lines to guide the eye. For the most popular application at
time $t-1$, the change in number of users between times $t-1$ and $t$ is indicated by the height
of the vertical red bar at time $t$, which corresponds to $f_1(t)$ in the text. Similarly, $f_2(t)$
and $f_3(t)$ are indicated by the green and blue bars, respectively. An easy way to understand the
process is first to compute the difference in the number of users for all applications given by
$f_i(t) = n_i(t) - n_i(t-1)$ and then color the difference based on $r_i(t-1)$, the rank of the
application at time $t-1$. (B) The synthetic time series are seeded by the initial values taken
from the empirical data such that $\tilde{n}_1(1) = n_{(1)}(1)$, $\tilde{n}_2(1) = n_{(2)}(1)$, and
$\tilde{n}_3(1) = n_{(3)}(1)$ of the empirical data and they are constructed by adding together the
difference bars of the same color. Overlapping bars have been shifted slightly horizontally for
clarity of presentation.

Supplementary Information

S1: Background to fluctuation scaling

Fluctuation scaling (FS) was introduced to a wider physics audience in a recent article by
Eisler, Bartos, and Kertesz (7). In temporal fluctuation scaling (TFS), we start from a multitude
of $M$ time series measured in the interval $[0, T]$ and assume that the constituents, i.e. the
random variables making up the signal, are additive. The signals are divided into blocks of
duration $\Delta t$, and for any block in the interval $[t, t + \Delta t)$ the signal can be
decomposed as

$$f_i^{\Delta t}(t) = \sum_{n=1}^{N_i^{\Delta t}(t)} V_{i,n}^{\Delta t}(t), \qquad (2)$$

where $N_i^{\Delta t}(t)$ is the number of constituents within the block, i.e. the number of random
variables $V_{i,n}^{\Delta t}(t)$ to be summed together, of signal $i$ during
$[t, t + \Delta t)$. We assume that $V_{i,n}^{\Delta t}(t) > 0$, so that the time average of
$f_i^{\Delta t}$, denoted by $\langle f_i^{\Delta t} \rangle$, does not vanish. It is defined as

$$\langle f_i^{\Delta t} \rangle = \frac{1}{Q} \sum_{q=0}^{Q-1} f_i^{\Delta t}(q \Delta t) = \frac{1}{Q} \sum_{q=0}^{Q-1} \sum_{n=1}^{N_i^{\Delta t}(q \Delta t)} V_{i,n}^{\Delta t}(q \Delta t), \qquad (3)$$

where $Q = T / \Delta t$. For any $\Delta t$ the variance can be obtained as

$$\sigma_i^2(\Delta t) = \langle [f_i^{\Delta t}]^2 \rangle - \langle f_i^{\Delta t} \rangle^2. \qquad (4)$$

This quantity characterizes the fluctuations of the activity of signal $i$ from block to block. When
$f$ is positive and additive, it is often observed that the relationship between the standard
deviation $\sigma_i(\Delta t)$ and the mean $\langle f_i^{\Delta t} \rangle$ is given by a power law

$$\sigma_i(\Delta t) \propto \langle f_i^{\Delta t} \rangle^{\alpha_T}, \qquad (5)$$

where one varies $i$ keeping $\Delta t$ fixed. Note that the value of $\Delta t$ does not affect the
scaling as it can be absorbed in the proportionality constant. The exponent $\alpha_T$ is in the
range $[1/2, 1]$ and the subscript $T$ indicates that the statistical quantities are defined as
temporal averages to distinguish them from ensemble fluctuation scaling (7).
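
The block construction behind the temporal mean and variance can be sketched as follows. This is a minimal illustration that assumes the signal is already available as per-step activity on a regular grid and that the block length `dt` is given as a whole number of steps; the function name `block_moments` and the toy input are hypothetical.

```python
import numpy as np


def block_moments(f, dt):
    """Sum each series into blocks of dt steps; return block mean and std.

    f has shape (M, T). Trailing steps that do not fill a block are dropped.
    """
    f = np.asarray(f, dtype=float)
    M, T = f.shape
    Q = T // dt                                   # number of complete blocks
    blocks = f[:, :Q * dt].reshape(M, Q, dt).sum(axis=2)
    mean = blocks.mean(axis=1)
    # Variance as <f^2> - <f>^2, clipped at zero to absorb rounding error.
    var = np.maximum((blocks ** 2).mean(axis=1) - mean ** 2, 0.0)
    return mean, np.sqrt(var)


# Hypothetical toy signal: one series, six steps, blocks of two steps.
f_demo = np.array([[1, 2, 3, 4, 5, 6]])
mean_demo, std_demo = block_moments(f_demo, dt=2)
```

Plotting the logarithm of the standard deviation against the logarithm of the mean for many series at a fixed `dt` then gives the kind of scatter from which the temporal exponent is read off as a slope.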

In the paper, we discuss a more system specific form of fluctuation scaling using spin
variables $S_{i,n}(t)$ as constituent variables. Further, instead of having access to signals in
continuous time, we consider, as a starting point, data sampled at discrete time intervals such
that two consecutive time points $t$ and $t + 1$ are separated by $\delta t$ in physical time. The
corresponding events in real physical time may have an arbitrary time resolution but, due to finite
temporal sampling resolution, all events within one block may be considered concurrent.

S2: Example of fluctuation scaling 

Let us consider a set of state or spin variables Sij(t) G {—1, 0, 1}, one for each application i 
of every user. Here Sij(t) = 1 corresponds to user n adopting application i at time t, Sij(t) = 
corresponds to there being no activity from user j regarding application % at time t, and 
Sij(t) = — 1 corresponds to user j dropping application i at time t. The FS exponent a can 
be interpreted in terms of correlations between the constituent variables, in this case the spin 
variables Sij(t). This leads to two limiting cases. If the constituent variables are uncorrelated, 
one obtains square-root scaling with a = 1/2, whereas if the constituent variables are fully 
correlated, one obtains a linear scaling with a — 1. 

Two simple examples will illustrate this interpretation. Consider a variable Sij(t) with 
the mean and variance of given by (Si) and S|., respectively. If the random variables Si t j(t) 
are independent and identically distributed for all n and t, we obtain by the linearity of the 
expectation operator E[-], taken over time, that 

N 



fi l = E[f i (t)}=E 



J=l 



NE[S i)j (t)] = N(S i ) (6) 
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The variance is given by 

N 



^ = Var [/,(*)] =Var 



E 5 «w 

.j'=i 



iWar^-^l^iVEl (7) 



since the variance of the sum of uncorrelated random variables (as follows from their independence) is the sum of their variances. Combining the expressions for the mean and the variance gives $\sigma_i^2 = (\Sigma_i^2 / \langle S_i \rangle)\,\mu_i$, i.e. $\sigma_i \propto \mu_i^{1/2}$, so that $\alpha = 1/2$. The exponent $\alpha = 1/2$ is then a consequence of the central limit theorem and is reminiscent of the $1/\sqrt{N}$ fluctuations of extensive quantities, such as energy, in equilibrium statistical mechanics. On the other hand, if the random variables $S_{i,j}(t)$ are completely correlated, such that $S_{i,1}(t) = S_{i,2}(t) = \cdots = S_{i,N}(t)$, we can write $\sum_{j=1}^{N} S_{i,j}(t) = N S_{i,1}(t)$ which, as before, gives

$$\mu_i = N E[S_{i,1}(t)] = N \langle S_i \rangle, \tag{8}$$

but now

$$\sigma_i^2 = \mathrm{Var}[N S_{i,1}(t)] = N^2\,\mathrm{Var}[S_{i,1}(t)] = N^2 \Sigma_i^2, \tag{9}$$

resulting in $\sigma_i = (\Sigma_i / \langle S_i \rangle)\,\mu_i$ so that $\alpha = 1$. One way to produce $\alpha = 1$ is by a global driving force that imposes strong fluctuations that dominate over the local dynamics of the system ([7]).
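
The two limiting cases can be checked numerically. The following is a minimal sketch, not the analysis used in the paper: it assumes binary spins (adoption with probability `p_adopt`, no drops), varies the mean by varying the number of users $N$, and fits the slope of $\log \sigma$ against $\log \mu$. The function names, the adoption probability and the system sizes are illustrative choices.

```python
import math
import random
import statistics


def simulate_activity(n_users, p_adopt, n_steps, correlated, rng):
    """Return f(t) = sum_j S_j(t) for one application over n_steps time steps.

    Each spin S_j(t) is 1 with probability p_adopt and 0 otherwise (no drops).
    If `correlated` is True, all users share one draw per time step.
    """
    series = []
    for _ in range(n_steps):
        if correlated:
            shared = 1 if rng.random() < p_adopt else 0
            series.append(n_users * shared)
        else:
            series.append(sum(rng.random() < p_adopt for _ in range(n_users)))
    return series


def fitted_exponent(n_values, correlated, p_adopt=0.3, n_steps=1000, seed=0):
    """Least-squares slope of log10(sigma) against log10(mu) across system sizes."""
    rng = random.Random(seed)
    xs, ys = [], []
    for n_users in n_values:
        f = simulate_activity(n_users, p_adopt, n_steps, correlated, rng)
        xs.append(math.log10(statistics.fmean(f)))
        ys.append(math.log10(statistics.pstdev(f)))
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


sizes = [25, 50, 100, 200, 400, 800]  # illustrative system sizes
alpha_independent = fitted_exponent(sizes, correlated=False)
alpha_correlated = fitted_exponent(sizes, correlated=True)
print(f"independent spins: alpha = {alpha_independent:.2f}")
print(f"fully correlated spins: alpha = {alpha_correlated:.2f}")
```

With independent spins the fitted slope should lie close to 1/2, and with fully correlated spins close to 1, up to sampling noise.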

S3: Stationarity of time series 

The fact that for most applications $f_i(t)$ is an increasing function of time suggests that the system is not stationary and, consequently, violates the assumption of stationarity. The question then becomes whether the system is sufficiently close to stationarity so that the fluctuation scaling exponents can be interpreted in terms of correlations among the constituent variables. We can write

$$\mu_i = \langle f_i(t) \rangle = \frac{1}{T_i} \sum_{t=1}^{T_i} f_i(t) = \frac{1}{T_i} \sum_{t=1}^{T_i} \left[ n_i(t) - n_i(t-1) \right] = \frac{1}{T_i} \sum_{t=1}^{T_i} \sum_{j=1}^{N} S_{i,j}(t), \tag{10}$$

where the latter sum is taken over all $N$ Facebook users and we have used $\sum_{j=1}^{N} S_{i,j}(t) = n_i(t) - n_i(t-1)$. Let us now assume that only irreversible $S_{i,j}(t) = 0 \to S_{i,j}(t+1) = 1$ changes are possible. The validity of this assumption has mostly to do with the choice of the investigated time period. Facebook applications had only recently been introduced, and there was less choice of, and less competition between, applications; hence, dropping of applications was conceivably rather rare. Quantifying the extent of uninstallation of applications would, however, require access to the micro-level data.

Instead of letting the sum indexed by $j$ in Eq. 10 run over the entire system (over all users), we construct a restricted sum consisting of those users only who have not adopted application $i$ by the previous time step. This yields

$$\frac{1}{T_i} \sum_{t=1}^{T_i} \sum_{j=1}^{N} S_{i,j}(t) = \frac{1}{T_i} \sum_{t=1}^{T_i} \sum_{k=1}^{N - n_i(t)} S_{i,j_k}(t), \tag{11}$$

where the subset of indices $j_1, \ldots, j_{N - n_i(t)} \in \{1, 2, \ldots, N\}$ is such that $S_{i,j_k}(t-1) = 0$.

The non-stationarity of $f_i(t)$ is reflected in the fact that the number of terms in the above sum, $N - n_i(t)$, depends on (typically decreases with) time. While this is true for almost every application, it may be a problem only for the highly popular applications, i.e. in the high density regime. Let us impose the stringent condition that the system is within the low density regime, corresponding to the set of applications for which $f_i(t)$ are sufficiently close to stationarity, when at most 1% of users have the application. Within this regime, the number of terms in the last sum of Eq. 11 is always between $0.99N$ and $N$ and, consequently, it decreases only marginally and the time series can be taken to be sufficiently stationary.

To see how far the low density regime extends, we set $N - n^* = 0.99N$, giving an upper limit $n^* = N/100$. The number of users at the end of the time period is $n_i(T) = n_i(0) + \mu_i T \approx \mu_i T$, the approximation being rather good in the low density regime, and we can assume that the approximate stationarity holds throughout the time horizon for applications with $n_i(T) < n^*$. Setting $n^* = \mu^* T$ defines the low density regime as $0 < \mu < \mu^*$ with $\mu^* = N/(100T)$. The stationarity can be expected to break down for applications with $\mu_i > \mu^* \approx 414$ so that $\log_{10}(\mu^*) \approx 2.6$. This means that, even under this relatively strict interpretation of stationarity, 97.8% of the time series are stationary. This also means that the scaling in Fig. 2C holds for over two orders of magnitude above the cross-over point $\mu_\times$. We conclude that the system is sufficiently stationary so that the fluctuation scaling exponents for temporal fluctuations may be interpreted in terms of correlations between the constituent variables.
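
The threshold $\mu^* = N/(100T)$ is simple arithmetic, sketched below. The values of `n_users` and `n_steps` in the example call are hypothetical placeholders chosen only to exercise the function (neither $N$ nor $T$ is stated in this section); the only figure taken from the text is $\mu^* \approx 414$.

```python
import math


def low_density_threshold(n_users, n_steps, max_fraction=0.01):
    """Largest mean activity mu* for which mu* * T stays at max_fraction * N."""
    return max_fraction * n_users / n_steps


# Hypothetical inputs for illustration only; N and T are not given here.
example_mu_star = low_density_threshold(n_users=50_000_000, n_steps=1200)

# Value quoted in the text, and its base-10 logarithm.
quoted_mu_star = 414.0
log_quoted = math.log10(quoted_mu_star)
print(f"example mu* = {example_mu_star:.1f}")
print(f"log10(414) = {log_quoted:.2f}")
```

The base-10 logarithm of 414 rounds to 2.6, which agrees with the value quoted above and indicates that the logarithms in this section are taken to base 10.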

We can also relax the assumption about having only irreversible $S_{i,j}(t) = 0 \to S_{i,j}(t+1) = 1$ changes. Let $S_{i,j}(t) = 1$ correspond to user $j$ adopting application $i$ at time $t$, $S_{i,j}(t) = 0$ to there being no activity from user $j$ regarding application $i$ at time $t$, and $S_{i,j}(t) = -1$ to user $j$ dropping application $i$ at time $t$. Allowing $S_{i,j}(t) = -1$ means that the value of $\langle f_i \rangle$ may vanish or become negative. Of the $M = 2705$ applications analysed, 2562 have $\mu_i > 0$, 5 have $\mu_i = 0$, and for 138 applications $\mu_i < 0$. Combining these numbers, we can see that 95% of the temporal averages $\mu_i$ are, in fact, positive and, consequently, the possibility of non-positive means does not pose a problem.
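
As a quick consistency check on the counts quoted above (a sketch using only the numbers given in the text):

```python
# Counts of applications by sign of the temporal average, as quoted in the text.
counts = {"positive": 2562, "zero": 5, "negative": 138}

total = sum(counts.values())
fraction_positive = counts["positive"] / total
print(f"total = {total}, positive fraction = {fraction_positive:.1%}")
```

The three counts add up to the $M = 2705$ applications analysed, and the positive fraction rounds to 95%.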

S4: Breakpoint analysis for linear regression 

Consider the linear regression model

$$y_i = x_i^{\top} \beta_i + u_i, \qquad i = 1, \ldots, n, \tag{12}$$

where $y_i$ is observation $i$ of the dependent variable, $x_i$ is a $k \times 1$ vector of regressors with the first component set equal to unity, and $\beta_i$ is a $k \times 1$ vector of regression coefficients that may vary over time. The null hypothesis is that the regression coefficients remain constant

$$H_0: \beta_i = \beta_0, \qquad i = 1, \ldots, n, \tag{13}$$
against the alternative hypothesis $H_1$ that at least one of the coefficients changes. In general, if there are $m$ breakpoints, the regression coefficients are constant within the resulting $m + 1$ segments. The model can be rewritten to incorporate the breakpoints as

$$y_i = x_i^{\top} \beta_j + u_i, \qquad i = i_{j-1} + 1, \ldots, i_j, \quad j = 1, \ldots, m + 1, \tag{14}$$

where $\{i_1, \ldots, i_m\}$ is the set of breakpoints and $j$ is the segment index. Conventionally $i_0 = 0$ and $i_{m+1} = n$. Breakpoints are typically not given exogenously but need to be estimated from the data. Finding breakpoints in data is also known as testing for structural change in data, and there are two frameworks for doing that: F-statistics and generalized fluctuation tests (2). Here we follow the F-statistics test that can be used to test against a single breakpoint, corresponding to the case with $m = 1$ in the above framework, at an unknown observation $i_1$ with segment $j = 1$ covering observations $i = 1, \ldots, i_1$ and segment $j = 2$ covering observations $i = i_1 + 1, \ldots, n$. To identify the breakpoint $i_1$, we compute a sequence of F-statistics for a change at observation $i$ given by

$$F_i = \frac{\hat{u}^{\top} \hat{u} - \hat{u}(i)^{\top} \hat{u}(i)}{\hat{u}(i)^{\top} \hat{u}(i) / (n - 2k)}, \tag{15}$$

where $\hat{u}$ are the ordinary least squares residuals from the unsegmented (no breakpoint) model and $\hat{u}(i)$ are the ordinary least squares residuals from a segmented model with a breakpoint at observation $i$ and the regression is carried out separately for each segment ([2]).

From the above definition it is clear that $F_i$ increases with the residual sum of squares of the unsegmented model, $\hat{u}^{\top} \hat{u}$, and decreases with the residual sum of squares of the segmented model, $\hat{u}(i)^{\top} \hat{u}(i)$. To ensure that each regression model can be estimated with a sufficient number of data points, we need to introduce a trimming parameter $h$ such that we compute $F_i$ for a subset of $i = h, h + 1, \ldots, n - h$ observations. In practice, we can compute $F_i$ for all $i = 1, \ldots, n$ and simply ignore the resulting values of $F_i$ for very small and very large values of $i$, where a suitable value of $h$ is chosen by the practitioner. The null hypothesis $H_0$ is rejected if the maximum value of $F_i$ is "large" ([2]). What precisely it means for $F_i$ to be large depends on the context. In any case, what matters is the relative height and narrowness of the maximum value of $F_i$ with respect to all the other values: a peak that is high and narrow is stronger evidence of a structural change in data than a peak that is low and wide.
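
The sequence of F-statistics can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code used for the analysis: it handles only simple linear regression ($k = 2$, intercept and slope), and the test data are synthetic, with a kink between slopes 0.55 and 0.85 placed at a hypothetical position and a small amount of Gaussian noise.

```python
import random


def ols_rss(xs, ys):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def f_statistics(xs, ys, h):
    """Return {i: F_i} for a single break after observation i, i = h..n-h.

    Observations are assumed sorted by x; the first segment holds the first
    i observations and the second segment holds the remaining n - i.
    """
    n, k = len(xs), 2  # k = 2 regressors: intercept and slope
    rss_full = ols_rss(xs, ys)
    stats = {}
    for i in range(h, n - h + 1):
        rss_seg = ols_rss(xs[:i], ys[:i]) + ols_rss(xs[i:], ys[i:])
        stats[i] = (rss_full - rss_seg) / (rss_seg / (n - 2 * k))
    return stats


# Synthetic data: slope 0.55 up to the kink, slope 0.85 afterwards.
rng = random.Random(1)
n, true_break = 200, 120
xs = [-2.0 + 0.02 * i for i in range(n)]
x0 = xs[true_break]
ys = []
for x in xs:
    mean = 0.55 * x if x <= x0 else 0.55 * x0 + 0.85 * (x - x0)
    ys.append(mean + rng.gauss(0.0, 0.01))

f_stats = f_statistics(xs, ys, h=10)
i_hat = max(f_stats, key=f_stats.get)  # estimated breakpoint
print(f"estimated breakpoint at observation {i_hat}, F = {f_stats[i_hat]:.0f}")
```

The trimming parameter `h` keeps at least `h` observations in each segment, so both segment regressions remain well defined. The loop refits both segments for every candidate `i`, which is quadratic in $n$ but adequate for a few thousand observations.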

The results are shown in Figure 5. The data have been sorted in ascending order based on the x-variable such that $\mu_{(1)} < \mu_{(2)} < \cdots < \mu_{(M)}$. In the case of the empirical data, the F-statistic behaves smoothly and develops a clear maximum. This is strong evidence of there being a structural change in the data such that the two regimes to the left and right of the breakpoint are governed by different exponents, $\alpha_I \approx 0.55$ and $\alpha_C \approx 0.85$, respectively.

The behavior of the F-statistic for the synthetic data, however, is qualitatively very different. Instead of a smooth, single maximum, the error landscape is more rugged, and the maximum appears to be degenerate. Strictly speaking, there is a single maximum $F \approx 186$ at observation $k = 562$, corresponding to $\log(\mu_{(562)}) \approx -0.38$, but there is also a secondary maximum for $k \approx 1800$. The lack of a clearly defined maximum suggests that there is insufficient statistical evidence to introduce a breakpoint in the data. Note that the above framework does not allow introducing multiple breakpoints. While this could be done in principle by adding more degrees of freedom (more parameters), it becomes exceedingly difficult to justify them, especially if the differences in the slopes are very small. To demonstrate this, consider accepting the view that there is, in fact, a legitimate breakpoint at $k = 562$ in the synthetic data. This results in two exponents $\alpha \approx 0.84$ and $\alpha \approx 0.87$, which are so close to one another that it is difficult to justify theoretically their slightly different values. We conclude, given these considerations, that the behavior of the synthetic data is governed by just a single exponent $\alpha_S \approx 0.84$.

Figure 5: F-statistics for breakpoint analysis. (A) The F-statistic is smooth and well-behaved for the empirical data and reaches a maximum of $F \approx 1035$ for observation $k = 1759$. This maximum corresponds to a breakpoint at $\log(\mu_{(1759)}) \approx 0.36$ and it separates the data into two regimes characterized by exponents $\alpha_I \approx 0.55$ and $\alpha_C \approx 0.85$ to the left and right of the point, respectively. (B) The F-statistic for the synthetic data is very rugged and the resulting maximum is in practice degenerate. This irregular behavior of the F-statistic violates the underlying assumption of having a well-defined maximum and, consequently, does not provide sufficient statistical evidence for introducing a breakpoint in the data.

Figure 6: Total activity $F(t)$ as a function of time $t$, where a unit of time is one observation, corresponding to calendar time from June 25, 2007 to August 14, 2007.

S5: Supporting data analysis 

Let us define the total activity as $F(t) = \sum_i f_i(t)$, where the sum runs over all applications that are in existence at time $t$. The total activity $F(t)$, which is not to be confused with the F-statistic in Section S4, corresponds to the total number of applications installed in the one-hour interval between $t - 1$ and $t$. We show $F(t)$ in Fig. 6, where the daily 24-hour period of activity is clearly visible.
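
The definition of the total activity can be illustrated with a short sketch. The data layout (a mapping from an application name to its first observation time and its cumulative installation counts) and the toy numbers are assumptions made for this example only.

```python
def activity(cumulative):
    """f_i(t) = n_i(t) - n_i(t-1) from a series of cumulative counts n_i."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def total_activity(apps):
    """F(t) = sum_i f_i(t) over the applications that exist at time t.

    `apps` maps a name to (t_first, counts) with counts[k] = n_i(t_first + k).
    """
    totals = {}
    for t_first, counts in apps.values():
        for offset, f in enumerate(activity(counts), start=1):
            t = t_first + offset
            totals[t] = totals.get(t, 0) + f
    return dict(sorted(totals.items()))


# Toy data (hypothetical): two applications with different introduction times.
toy_apps = {
    "app_a": (0, [0, 3, 5, 9]),  # observed at t = 0, 1, 2, 3
    "app_b": (2, [0, 4]),        # first observed at t = 2
}
toy_total = total_activity(toy_apps)
print(toy_total)
```

An application contributes to $F(t)$ only from the first time step after it is introduced, which mirrors the restriction of the sum to applications in existence at time $t$.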

It is possible that, for a given application, the mean and standard deviation of activity $f_i$ result from the application being at a certain stage of its lifetime. Consequently, given that we have a mixture of old and new applications, if the scaling of the standard deviation of $f_i$ with the mean of $f_i$ were dependent on the age of the application, this could in principle contribute to the cross-over reported in the main text. To test this hypothesis, we define the time-shifted activity for application $i$ as $g_i(\tau) = f_i(t_i + \tau)$ with $\tau > 0$, where $t_i$ is the (approximate) introduction time of application $i$. The time-shifted aggregated numbers $n_i(t_i + \tau)$ are shown in the upper panel of Fig. 7, and the time-shifted activities $g_i(\tau)$ are in the lower panel. We can now compute the mean and standard deviation of the time-shifted activity $g_i(\tau)$ by truncating the time series at $\tau$, i.e. by taking the first $\tau$ points of the time series.

Figure 7: Time-shifted aggregate numbers of application installations $n_i(t_i + \tau)$ as a function of application lifetime $\tau$ (top) and the related time-shifted activity values $g_i(\tau)$ (bottom). For purposes of visualization, both plots include a subset of 100 applications.

We define an ensemble average of the time-shifted activities taken over all $N(\tau)$ applications that have a lifetime of at least $\tau$ as



Similarly, we can define the ensemble average of the standard deviation of the time-shifted 
activities as 



where $\langle g_i(\tau) \rangle = (1/\tau) \sum_{t=1}^{\tau} g_i(t)$. We plot the standard deviation $h(\tau)$ versus the mean $g(\tau)$ for a number of different truncation points $\tau \in \{50, 60, \ldots, 1000\}$ in Fig. 8. A linear fit describes their dependence very well, and demonstrates that the relationship between the mean and the standard deviation for the ensemble of applications does not depend on the age of the application, i.e. the stage of the application in its lifetime. The fact that the dependence of $h(\tau)$ on $g(\tau)$ holds throughout the measured lifetime of applications demonstrates that the cross-over in the fluctuation scaling plot in the main text cannot be explained by having a mixture of applications that are at different stages of their lifetime. Finally, we repeat the fluctuation scaling plot in Fig. 9, this time using only applications that have lifetimes $50 < T_i < T$ such that they were introduced during the first $T - 50$ time steps, corresponding to $t_i \in [0, T - 50]$, such that for each application we have at least 50 points for estimating the first and second moments. The result is essentially identical to the one presented in the main text. In particular, the high $\mu$ applications are still present, as is the cross-over (fits not shown). This demonstrates explicitly that the high $\mu$ regime is not simply produced by applications that have a large number of installations for $t < 0$, i.e. before the start of data collection.
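
The ensemble averages can be sketched as follows. The exact expressions for $g(\tau)$ and $h(\tau)$ are not legible in this copy of the text, so the code assumes the most direct reading of the verbal definitions: for every application with at least $\tau$ observations, take the mean and the (population) standard deviation of its first $\tau$ time-shifted activity values, then average each quantity over those applications. The function name and the toy data are illustrative.

```python
import statistics


def ensemble_averages(series_by_app, tau):
    """Return (g(tau), h(tau)) over applications with at least tau observations.

    Assumed definitions: g is the average of the per-application means and h
    is the average of the per-application population standard deviations,
    both computed on the first tau points of each time-shifted series.
    """
    eligible = [s[:tau] for s in series_by_app if len(s) >= tau]
    if not eligible:
        raise ValueError("no application has a lifetime of at least tau")
    g = statistics.fmean([statistics.fmean(s) for s in eligible])
    h = statistics.fmean([statistics.pstdev(s) for s in eligible])
    return g, h


# Toy time-shifted activity series (hypothetical); the third is too short
# to be included for tau >= 2.
toy_series = [[1, 1, 1, 1], [0, 2, 0, 2], [5]]
g2, h2 = ensemble_averages(toy_series, tau=2)
g1, h1 = ensemble_averages(toy_series, tau=1)
print(g2, h2, g1, h1)
```

Plotting `h` against `g` for a range of `tau` values would then give the kind of curve described above. Whether the original analysis used the population or the sample standard deviation is not stated, so `pstdev` here is a choice made for the sketch.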

Figure 8: Ensemble average of the standard deviation of the time-shifted activity, $h(\tau)$, has a fixed dependence on the ensemble average of the time-shifted activity, $g(\tau)$, throughout the lifetime of applications. The different points correspond to different values of $\tau \in \{50, 60, 70, \ldots, 1000\}$ such that increasing the value of $\tau$ leads to increasing values of $g(\tau)$. The values of $\tau$ start at 50 since we required that for each application there should be at least 50 points in the time series in order to estimate its first and second moments sufficiently accurately.

Figure 9: Fluctuation scaling plot for activity $f_i$ using only those applications $i$ that were introduced during the studied time period and for which we had at least 50 time steps' worth of data.